In [1]:

    
import math
import numpy as np

Introduction to NumPy

NumPy is a library that provides multi-dimensional array objects. You can think of them as somewhat like ordinary Python lists, except that they have several properties that make them better suited to numeric computation.

Let's try adding two lists together:



In [2]:

    
x = [1,2,3]
y = [4,5,6]
x + y









    Out[2]:





[1, 2, 3, 4, 5, 6]

With Python lists, the + operator concatenates them. If we wanted to add these two lists elementwise, we'd have to use a loop:



In [3]:

    
z = [0]*len(x) # Generates a list of zeroes the same length as x.
for i in range(len(x)):
    z[i] = x[i] + y[i]
z









    Out[3]:





[5, 7, 9]

or a list comprehension:



In [4]:

    
[i + j for (i, j) in zip(x, y)]









    Out[4]:





[5, 7, 9]

With NumPy arrays, the + operator behaves differently:



In [5]:

    
xNumpy = np.array([1, 2, 3])
yNumpy = np.array([4, 5, 6])
xNumpy + yNumpy









    Out[5]:





array([5, 7, 9])

The + operator applied to NumPy arrays performs elementwise addition. The -, * and / operators also apply elementwise. Using these operators makes the code much easier to read than the equivalent loop.

The other advantage of NumPy arrays has to do with performance. Let's perform elementwise multiplication of the first 1 million positive integers divided by 3 and the first 1 million positive integers divided by 7, that is:
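As a quick illustration of the other operators, here is a minimal sketch applying -, * and / to two small arrays. The names `a` and `b` and their values are chosen for this example only and do not appear elsewhere in the notebook.

```python
import numpy as np

# Hypothetical example arrays, not taken from the cells above.
a = np.array([4.0, 10.0, 18.0])
b = np.array([1.0, 2.0, 3.0])

difference = a - b  # elementwise subtraction: [3., 8., 15.]
product = a * b     # elementwise multiplication: [4., 20., 54.]
quotient = a / b    # elementwise division: [4., 5., 6.]

print(difference, product, quotient)
```

Each result is a new array of the same length as the inputs; `a` and `b` themselves are left unchanged.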

[1/3, 2/3, ..., 999999/3, 1000000/3] * [1/7, 2/7, ..., 999999/7, 1000000/7]



In [6]:

    
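def normal_multiply(x, y):
    # Multiply pairwise in pure Python, one element at a time.
    return [i * j for i, j in zip(x, y)]

def numpy_multiply(x, y):
    # Elementwise multiplication of two NumPy arrays.
    return x * y

# The first 1 million positive integers divided by 3 and by 7.
x = [i/3. for i in range(1,1000001)]
y = [i/7. for i in range(1,1000001)]
xNumpy = np.array(x)
yNumpy = np.array(y)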

Both functions perform the same operation, one using a Python list comprehension and the other taking advantage of NumPy arrays.
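One way to convince yourself that the two functions agree is to compare their results on a smaller input. This is only a sketch: the functions are repeated so the block runs on its own, and the names `x_small` and `y_small` and the size of 1000 are assumptions made for the example.

```python
import numpy as np

def normal_multiply(x, y):
    return [i * j for i, j in zip(x, y)]

def numpy_multiply(x, y):
    return x * y

# A smaller input than the notebook uses, so the check runs quickly.
x_small = [i / 3. for i in range(1, 1001)]
y_small = [i / 7. for i in range(1, 1001)]

list_result = normal_multiply(x_small, y_small)
array_result = numpy_multiply(np.array(x_small), np.array(y_small))

# np.allclose compares elementwise within a floating-point tolerance.
same = np.allclose(list_result, array_result)
print(same)
```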



In [7]:

    
%timeit normal_multiply(x, y)









    



1 loop, best of 3: 235 ms per loop



In [8]:

    
%timeit numpy_multiply(xNumpy, yNumpy)









    



100 loops, best of 3: 3.81 ms per loop
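%timeit is an IPython magic, so it only works inside a notebook or IPython shell. In a plain Python script, a similar measurement can be sketched with the standard library's timeit module. The input size `n` and the repeat counts below are assumptions picked to keep the run short, and the timings you get will depend on your machine.

```python
import timeit
import numpy as np

def normal_multiply(x, y):
    return [i * j for i, j in zip(x, y)]

def numpy_multiply(x, y):
    return x * y

n = 100_000  # assumption: smaller than the notebook's 1 million
x = [i / 3. for i in range(1, n + 1)]
y = [i / 7. for i in range(1, n + 1)]
x_arr = np.array(x)
y_arr = np.array(y)

runs = 5  # calls per timing sample
# Take the best of 3 samples, then divide by the calls per sample.
list_time = min(timeit.repeat(lambda: normal_multiply(x, y), number=runs, repeat=3)) / runs
array_time = min(timeit.repeat(lambda: numpy_multiply(x_arr, y_arr), number=runs, repeat=3)) / runs

print("list comprehension: %.3f ms per call" % (list_time * 1e3))
print("NumPy arrays:       %.3f ms per call" % (array_time * 1e3))
```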

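The numpy_multiply function is roughly 60 times faster than the normal_multiply function here (3.81 ms versus 235 ms), even though they both compute the same thing. The reason has to do with how Python lists and NumPy arrays are represented in memory: a list holds references to individually allocated Python objects, while a NumPy array stores values of a single type in one contiguous block that compiled code can loop over.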
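The difference in representation can be inspected directly. The following sketch uses a small, hypothetical list `values` and looks at a few array attributes. The size reported by sys.getsizeof depends on the Python implementation, so no particular number is assumed for it.

```python
import sys
import numpy as np

values = [1.0, 2.0, 3.0, 4.0]  # hypothetical example data
arr = np.array(values)

# Every element of the array has the same fixed-size type.
print(arr.dtype)     # float64
print(arr.itemsize)  # bytes per element: 8
print(arr.nbytes)    # bytes for all elements: 4 * 8 = 32

# The elements sit next to each other in one block of memory.
print(arr.flags['C_CONTIGUOUS'])  # True

# A list instead holds references to separate Python float objects,
# each of which carries its own per-object overhead.
print(sys.getsizeof(values[0]))
```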

More information on NumPy can be found in the official NumPy documentation.